# WACV-2024-Papers

<table>
    <tr>
        <td><strong>Application</strong></td>
        <td>
            <a href="https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers" style="float:left;">
                <img src="https://img.shields.io/badge/🤗-NewEraAI--Papers-FFD21F.svg" alt="App" />
            </a>
        </td>
    </tr>
</table>

<div align="center">
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/blob/main/sections/2024/main/visualization.md">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/left.svg" width="40" alt="" />
    </a>
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/home.svg" width="40" alt="" />
    </a>
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/blob/main/sections/2024/main/agriculture.md">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/right.svg" width="40" alt="" />
    </a>
</div>

## Video Recognition and Understanding

![Section Papers](https://img.shields.io/badge/Section%20Papers-61-42BA16) ![Preprint Papers](https://img.shields.io/badge/Preprint%20Papers-35-b31b1b) ![Papers with Open Code](https://img.shields.io/badge/Papers%20with%20Open%20Code-33-1D7FBF) ![Papers with Video](https://img.shields.io/badge/Papers%20with%20Video-46-FF0000)

| **Title** | **Repo** | **Paper** | **Video** |
|-----------|:--------:|:---------:|:---------:|
| [Detecting Content Segments from Online Sports Streaming Events: Challenges and Solutions](https://openaccess.thecvf.com/content/WACV2024/html/Liu_Detecting_Content_Segments_From_Online_Sports_Streaming_Events_Challenges_and_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Liu_Detecting_Content_Segments_From_Online_Sports_Streaming_Events_Challenges_and_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=kV96eFyartE) |
| [Permutation-Aware Activity Segmentation via Unsupervised Frame-to-Segment Alignment](https://openaccess.thecvf.com/content/WACV2024/html/Tran_Permutation-Aware_Activity_Segmentation_via_Unsupervised_Frame-To-Segment_Alignment_WACV_2024_paper.html) | [![WEB Page](https://img.shields.io/badge/WEB-Page-159957.svg)](https://retrocausal.ai/rp-5-permutation-aware-action-segmentation-via-unsupervised-frame-to-segment-alignment/) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Tran_Permutation-Aware_Activity_Segmentation_via_Unsupervised_Frame-To-Segment_Alignment_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2305.19478-b31b1b.svg)](http://arxiv.org/abs/2305.19478) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=ZgvbwD3h-fc) |
| [OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation](https://openaccess.thecvf.com/content/WACV2024/html/Li_OTAS_Unsupervised_Boundary_Detection_for_Object-Centric_Temporal_Action_Segmentation_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/yl596/OTAS?style=flat)](https://github.com/yl596/OTAS) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Li_OTAS_Unsupervised_Boundary_Detection_for_Object-Centric_Temporal_Action_Segmentation_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.06276-b31b1b.svg)](http://arxiv.org/abs/2309.06276) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=nkXbZ_pWNec) |
| [Embodied Human Activity Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Hu_Embodied_Human_Activity_Recognition_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Hu_Embodied_Human_Activity_Recognition_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=vgVseNlmWU4) |
| [Semantic-Aware Video Representation for Few-Shot Action Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Tang_Semantic-Aware_Video_Representation_for_Few-Shot_Action_Recognition_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Tang_Semantic-Aware_Video_Representation_for_Few-Shot_Action_Recognition_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.06218-b31b1b.svg)](http://arxiv.org/abs/2311.06218) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=VKPUcH-O1i0) |
| [Leveraging the Power of Data Augmentation for Transformer-based Tracking](https://openaccess.thecvf.com/content/WACV2024/html/Zhao_Leveraging_the_Power_of_Data_Augmentation_for_Transformer-Based_Tracking_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/zj5559/DATr?style=flat)](https://github.com/zj5559/DATr) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Zhao_Leveraging_the_Power_of_Data_Augmentation_for_Transformer-Based_Tracking_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.08264-b31b1b.svg)](http://arxiv.org/abs/2309.08264) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=-q4ErdELVCY) |
| [CAMOT: Camera Angle-Aware Multi-Object Tracking](https://openaccess.thecvf.com/content/WACV2024/html/Limanta_CAMOT_Camera_Angle-Aware_Multi-Object_Tracking_WACV_2024_paper.html) | :heavy_minus_sign:  | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Limanta_CAMOT_Camera_Angle-Aware_Multi-Object_Tracking_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=TBWW9gzqIm8) |
| [Detection Defenses: An Empty Promise Against Adversarial Patch Attacks on Optical Flow](https://openaccess.thecvf.com/content/WACV2024/html/Scheurer_Detection_Defenses_An_Empty_Promise_Against_Adversarial_Patch_Attacks_on_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/cv-stuttgart/DetectionDefenses?style=flat)](https://github.com/cv-stuttgart/DetectionDefenses) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Scheurer_Detection_Defenses_An_Empty_Promise_Against_Adversarial_Patch_Attacks_on_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.17403-b31b1b.svg)](http://arxiv.org/abs/2310.17403) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=hvo_bxTUxXo) |
| [Repetitive Action Counting with Motion Feature Learning](https://openaccess.thecvf.com/content/WACV2024/html/Li_Repetitive_Action_Counting_With_Motion_Feature_Learning_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Repetitive_Action_Counting_With_Motion_Feature_Learning_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=g3fZmkN_sqc) |
| [United we Stand, Divided we Fall: UnityGraph for Unsupervised Procedure Learning from Videos](https://openaccess.thecvf.com/content/WACV2024/html/Bansal_United_We_Stand_Divided_We_Fall_UnityGraph_for_Unsupervised_Procedure_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Bansal_United_We_Stand_Divided_We_Fall_UnityGraph_for_Unsupervised_Procedure_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.03550-b31b1b.svg)](http://arxiv.org/abs/2311.03550) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=nYsvnBepLhQ) |
| [Sequential Transformer for End-to-End Video Text Detection](https://openaccess.thecvf.com/content/WACV2024/html/Zhang_Sequential_Transformer_for_End-to-End_Video_Text_Detection_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/zjb-1/SeqVideoText?style=flat)](https://github.com/zjb-1/SeqVideoText) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Sequential_Transformer_for_End-to-End_Video_Text_Detection_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=PBKa5bRYamk) |
| [Context in Human Action through Motion Complementarity](https://openaccess.thecvf.com/content/WACV2024/html/Dessalene_Context_in_Human_Action_Through_Motion_Complementarity_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Dessalene_Context_in_Human_Action_Through_Motion_Complementarity_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=lnrfXg0rG68) |
| [Egocentric Action Recognition by Capturing Hand-Object Contact and Object State](https://openaccess.thecvf.com/content/WACV2024/html/Shiota_Egocentric_Action_Recognition_by_Capturing_Hand-Object_Contact_and_Object_State_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Shiota_Egocentric_Action_Recognition_by_Capturing_Hand-Object_Contact_and_Object_State_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=XWL5-MGWsXo) |
| [MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Kawamura_MIDAS_Mixing_Ambiguous_Data_With_Soft_Labels_for_Dynamic_Facial_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Kawamura_MIDAS_Mixing_Ambiguous_Data_With_Soft_Labels_for_Dynamic_Facial_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=l6lfDctenK0) |
| [FRoG-MOT: Fast and Robust Generic Multiple-Object Tracking by IoU and Motion-State Associations](https://openaccess.thecvf.com/content/WACV2024/html/Ogawa_FRoG-MOT_Fast_and_Robust_Generic_Multiple-Object_Tracking_by_IoU_and_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Ogawa_FRoG-MOT_Fast_and_Robust_Generic_Multiple-Object_Tracking_by_IoU_and_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Density-based Flow Mask Integration via Deformable Convolution for Video People Flux Estimation](https://openaccess.thecvf.com/content/WACV2024/html/Wan_Density-Based_Flow_Mask_Integration_via_Deformable_Convolution_for_Video_People_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Wan_Density-Based_Flow_Mask_Integration_via_Deformable_Convolution_for_Video_People_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [ConfTrack: Kalman Filter-based Multi-Person Tracking by Utilizing Confidence Score of Detection Box](https://openaccess.thecvf.com/content/WACV2024/html/Jung_ConfTrack_Kalman_Filter-Based_Multi-Person_Tracking_by_Utilizing_Confidence_Score_of_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/Hyonchori/ConfTrack_WACV2024?style=flat)](https://github.com/Hyonchori/ConfTrack_WACV2024) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Jung_ConfTrack_Kalman_Filter-Based_Multi-Person_Tracking_by_Utilizing_Confidence_Score_of_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=5rCce6hjkTA) |
| [CGAPoseNet+GCAN: A Geometric Clifford Algebra Network for Geometry-Aware Camera Pose Regression](https://openaccess.thecvf.com/content/WACV2024/html/Pepe_CGAPoseNetGCAN_A_Geometric_Clifford_Algebra_Network_for_Geometry-Aware_Camera_Pose_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Pepe_CGAPoseNetGCAN_A_Geometric_Clifford_Algebra_Network_for_Geometry-Aware_Camera_Pose_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=OBMUsrOPOAQ) |
| [Embedding Task Structure for Action Detection](https://openaccess.thecvf.com/content/WACV2024/html/Peven_Embedding_Task_Structure_for_Action_Detection_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/mpeven/Task-Structure-WACV-2024?style=flat)](https://github.com/mpeven/Task-Structure-WACV-2024) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Peven_Embedding_Task_Structure_for_Action_Detection_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=dgtQHcj4ZwA) |
| [Random Walks for Temporal Action Segmentation with Timestamp Supervision](https://openaccess.thecvf.com/content/WACV2024/html/Hirsch_Random_Walks_for_Temporal_Action_Segmentation_With_Timestamp_Supervision_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/RoyHirsch/RWS?style=flat)](https://github.com/RoyHirsch/RWS) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Hirsch_Random_Walks_for_Temporal_Action_Segmentation_With_Timestamp_Supervision_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=03MjSzwVDvw) |
| [MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Xian_MITFAS_Mutual_Information_Based_Temporal_Feature_Alignment_and_Sampling_for_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/Ricky-Xian/MITFAS?style=flat)](https://github.com/Ricky-Xian/MITFAS) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Xian_MITFAS_Mutual_Information_Based_Temporal_Feature_Alignment_and_Sampling_for_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2303.02575-b31b1b.svg)](http://arxiv.org/abs/2303.02575) | :heavy_minus_sign: |
| [Do VSR Models Generalize Beyond LRS3?](https://openaccess.thecvf.com/content/WACV2024/html/Djilali_Do_VSR_Models_Generalize_Beyond_LRS3_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/YasserdahouML/VSR_test_set?style=flat)](https://github.com/YasserdahouML/VSR_test_set) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Djilali_Do_VSR_Models_Generalize_Beyond_LRS3_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.14063-b31b1b.svg)](http://arxiv.org/abs/2311.14063) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=HKFTuaZpfxs) |
| [PGVT: Pose-Guided Video Transformer for Fine-Grained Action Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Zhang_PGVT_Pose-Guided_Video_Transformer_for_Fine-Grained_Action_Recognition_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_PGVT_Pose-Guided_Video_Transformer_for_Fine-Grained_Action_Recognition_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=H_P6gA4Oo0s) |
| [Differentially Private Video Activity Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Luo_Differentially_Private_Video_Activity_Recognition_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Luo_Differentially_Private_Video_Activity_Recognition_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2306.15742-b31b1b.svg)](http://arxiv.org/abs/2306.15742) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=lbY_0cZorwU) |
| [Video Instance Matting](https://openaccess.thecvf.com/content/WACV2024/html/Li_Video_Instance_Matting_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/SHI-Labs/VIM?style=flat)](https://github.com/SHI-Labs/VIM) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Video_Instance_Matting_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.04212-b31b1b.svg)](http://arxiv.org/abs/2311.04212) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=qa_73w40rDY) |
| [VMFormer: End-to-End Video Matting with Transformer](https://openaccess.thecvf.com/content/WACV2024/html/Li_VMFormer_End-to-End_Video_Matting_With_Transformer_WACV_2024_paper.html) | [![GitHub Page](https://img.shields.io/badge/GitHub-Page-159957.svg)](https://chrisjuniorli.github.io/project/VMFormer/) <br /> [![GitHub](https://img.shields.io/github/stars/SHI-Labs/VMFormer?style=flat)](https://github.com/SHI-Labs/VMFormer) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Li_VMFormer_End-to-End_Video_Matting_With_Transformer_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2208.12801-b31b1b.svg)](http://arxiv.org/abs/2208.12801) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=Qd1ZltMDC4I) |
| [DDAM-PS: Diligent Domain Adaptive Mixer for Person Search](https://openaccess.thecvf.com/content/WACV2024/html/Almansoori_DDAM-PS_Diligent_Domain_Adaptive_Mixer_for_Person_Search_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/mustansarfiaz/DDAM-PS?style=flat)](https://github.com/mustansarfiaz/DDAM-PS) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Almansoori_DDAM-PS_Diligent_Domain_Adaptive_Mixer_for_Person_Search_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.20706-b31b1b.svg)](http://arxiv.org/abs/2310.20706) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=llMQez-xOQs) |
| [Think Before you Simulate: Symbolic Reasoning to Orchestrate Neural Computation for Counterfactual Question Answering](https://openaccess.thecvf.com/content/WACV2024/html/Ishay_Think_Before_You_Simulate_Symbolic_Reasoning_To_Orchestrate_Neural_Computation_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/azreasoners/crcg?style=flat)](https://github.com/azreasoners/crcg) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Ishay_Think_Before_You_Simulate_Symbolic_Reasoning_To_Orchestrate_Neural_Computation_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Separable Self and Mixed Attention Transformers for Efficient Object Tracking](https://openaccess.thecvf.com/content/WACV2024/html/Gopal_Separable_Self_and_Mixed_Attention_Transformers_for_Efficient_Object_Tracking_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/goutamyg/SMAT?style=flat)](https://github.com/goutamyg/SMAT) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Gopal_Separable_Self_and_Mixed_Attention_Transformers_for_Efficient_Object_Tracking_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.03979-b31b1b.svg)](http://arxiv.org/abs/2309.03979) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=lyR03hsvlC4) |
| [Restoring Degraded Old Films with Recursive Recurrent Transformer Networks](https://openaccess.thecvf.com/content/WACV2024/html/Lin_Restoring_Degraded_Old_Films_With_Recursive_Recurrent_Transformer_Networks_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/mountln/RRTN-old-film-restoration?style=flat)](https://github.com/mountln/RRTN-old-film-restoration) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Lin_Restoring_Degraded_Old_Films_With_Recursive_Recurrent_Transformer_Networks_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=sLXEoJkgpCQ) |
| [Holistic Representation Learning for Multitask Trajectory Anomaly Detection](https://openaccess.thecvf.com/content/WACV2024/html/Stergiou_Holistic_Representation_Learning_for_Multitask_Trajectory_Anomaly_Detection_WACV_2024_paper.html) | [![GitHub Page](https://img.shields.io/badge/GitHub-Page-159957.svg)](https://alexandrosstergiou.github.io/project_pages/TrajREC/index.html) <br /> [![GitHub](https://img.shields.io/github/stars/alexandrosstergiou/TrajREC?style=flat)](https://github.com/alexandrosstergiou/TrajREC) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Stergiou_Holistic_Representation_Learning_for_Multitask_Trajectory_Anomaly_Detection_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.01851-b31b1b.svg)](http://arxiv.org/abs/2311.01851) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=6CVtFfmq82E) |
| [Interaction Region Visual Transformer for Egocentric Action Anticipation](https://openaccess.thecvf.com/content/WACV2024/html/Roy_Interaction_Region_Visual_Transformer_for_Egocentric_Action_Anticipation_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/LAHAproject/InAViT?style=flat)](https://github.com/LAHAproject/InAViT) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Roy_Interaction_Region_Visual_Transformer_for_Egocentric_Action_Anticipation_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2211.14154-b31b1b.svg)](http://arxiv.org/abs/2211.14154) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=XhI6ZNhbKbc) |
| [Object-Centric Video Representation for Long-Term Action Anticipation](https://openaccess.thecvf.com/content/WACV2024/html/Zhang_Object-Centric_Video_Representation_for_Long-Term_Action_Anticipation_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/brown-palm/ObjectPrompt?style=flat)](https://github.com/brown-palm/ObjectPrompt) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Object-Centric_Video_Representation_for_Long-Term_Action_Anticipation_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.00180-b31b1b.svg)](http://arxiv.org/abs/2311.00180) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=VH9rQULIGoU) |
| [A Hybrid Graph Network for Complex Activity Detection in Video](https://openaccess.thecvf.com/content/WACV2024/html/Khan_A_Hybrid_Graph_Network_for_Complex_Activity_Detection_in_Video_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/salmank255/CompAD?style=flat)](https://github.com/salmank255/CompAD) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Khan_A_Hybrid_Graph_Network_for_Complex_Activity_Detection_in_Video_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2211.14154-b31b1b.svg)](http://arxiv.org/abs/2310.17493) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=LR2aFtTx2zE) |
| [SSVOD: Semi-Supervised Video Object Detection with Sparse Annotations](https://openaccess.thecvf.com/content/WACV2024/html/Mahmud_SSVOD_Semi-Supervised_Video_Object_Detection_With_Sparse_Annotations_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/enyac-group/SSVOD?style=flat)](https://github.com/enyac-group/SSVOD) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Mahmud_SSVOD_Semi-Supervised_Video_Object_Detection_With_Sparse_Annotations_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.01391-b31b1b.svg)](http://arxiv.org/abs/2309.01391) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=BOC7iL6goSk) |
| [Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval](https://openaccess.thecvf.com/content/WACV2024/html/Huang_Semantic_Fusion_Augmentation_and_Semantic_Boundary_Detection_A_Novel_Approach_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/basiclab/SFABD?style=flat)](https://github.com/basiclab/SFABD) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Huang_Semantic_Fusion_Augmentation_and_Semantic_Boundary_Detection_A_Novel_Approach_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection](https://openaccess.thecvf.com/content/WACV2024/html/Al-lahham_A_Coarse-To-Fine_Pseudo-Labeling_C2FPL_Framework_for_Unsupervised_Video_Anomaly_Detection_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/AnasEmad11/C2FPL?style=flat)](https://github.com/AnasEmad11/C2FPL) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Al-lahham_A_Coarse-To-Fine_Pseudo-Labeling_C2FPL_Framework_for_Unsupervised_Video_Anomaly_Detection_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.17650-b31b1b.svg)](http://arxiv.org/abs/2310.17650) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=-zwWy0TOzxs) |
| [PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data](https://openaccess.thecvf.com/content/WACV2024/html/Herzig_PromptonomyViT_Multi-Task_Prompt_Learning_Improves_Video_Transformers_Using_Synthetic_Scene_WACV_2024_paper.html) | [![GitHub Page](https://img.shields.io/badge/GitHub-Page-159957.svg)](https://ofir1080.github.io/PromptonomyViT/) <br /> [![GitHub](https://img.shields.io/github/stars/ofir1080/PromptonomyViT?style=flat)](https://github.com/ofir1080/PromptonomyViT) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Herzig_PromptonomyViT_Multi-Task_Prompt_Learning_Improves_Video_Transformers_Using_Synthetic_Scene_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2212.04821-b31b1b.svg)](http://arxiv.org/abs/2212.04821) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=NW7cHEAErt8) |
| [GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap](https://openaccess.thecvf.com/content/WACV2024/html/Lee_GLAD_Global-Local_View_Alignment_and_Background_Debiasing_for_Unsupervised_Video_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_GLAD_Global-Local_View_Alignment_and_Background_Debiasing_for_Unsupervised_Video_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.12467-b31b1b.svg)](http://arxiv.org/abs/2311.12467) | :heavy_minus_sign: |
| [Beyond SOT: Tracking Multiple Generic Objects at Once](https://openaccess.thecvf.com/content/WACV2024/html/Mayer_Beyond_SOT_Tracking_Multiple_Generic_Objects_at_Once_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/visionml/pytracking?style=flat)](https://github.com/visionml/pytracking) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Mayer_Beyond_SOT_Tracking_Multiple_Generic_Objects_at_Once_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2212.11920-b31b1b.svg)](http://arxiv.org/abs/2212.11920) | :heavy_minus_sign: |
| [MFT: Long-Term Tracking of Every Pixel](https://openaccess.thecvf.com/content/WACV2024/html/Neoral_MFT_Long-Term_Tracking_of_Every_Pixel_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/serycjon/MFT?style=flat)](https://github.com/serycjon/MFT) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Neoral_MFT_Long-Term_Tracking_of_Every_Pixel_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2305.12998-b31b1b.svg)](http://arxiv.org/abs/2305.12998) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=6VwtqnfQmHg) |
| [Real-Time Weakly Supervised Video Anomaly Detection](https://openaccess.thecvf.com/content/WACV2024/html/Karim_Real-Time_Weakly_Supervised_Video_Anomaly_Detection_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Karim_Real-Time_Weakly_Supervised_Video_Anomaly_Detection_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Single-Image Deblurring, Trajectory and Shape Recovery of Fast Moving Objects with Denoising Diffusion Probabilistic Models](https://openaccess.thecvf.com/content/WACV2024/html/Spetlik_Single-Image_Deblurring_Trajectory_and_Shape_Recovery_of_Fast_Moving_Objects_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/radimspetlik/SI-DDPM-FMO?style=flat)](https://github.com/radimspetlik/SI-DDPM-FMO) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Spetlik_Single-Image_Deblurring_Trajectory_and_Shape_Recovery_of_Fast_Moving_Objects_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Contrastive Learning for Multi-Object Tracking with Transformers](https://openaccess.thecvf.com/content/WACV2024/html/De_Plaen_Contrastive_Learning_for_Multi-Object_Tracking_With_Transformers_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/De_Plaen_Contrastive_Learning_for_Multi-Object_Tracking_With_Transformers_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.08043-b31b1b.svg)](http://arxiv.org/abs/2311.08043) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=aTLua4QXl0s) |
| [Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders](https://openaccess.thecvf.com/content/WACV2024/html/Das_Limited_Data_Unlimited_Potential_A_Study_on_ViTs_Augmented_by_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/dominickrei/Limited-data-vits?style=flat)](https://github.com/dominickrei/Limited-data-vits) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Das_Limited_Data_Unlimited_Potential_A_Study_on_ViTs_Augmented_by_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.20704-b31b1b.svg)](http://arxiv.org/abs/2310.20704) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=yRBuBbUggA0) |
| [JOADAA: Joint Online Action Detection and Action Anticipation](https://openaccess.thecvf.com/content/WACV2024/html/Guermal_JOADAA_Joint_Online_Action_Detection_and_Action_Anticipation_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Guermal_JOADAA_Joint_Online_Action_Detection_and_Action_Anticipation_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.06130-b31b1b.svg)](http://arxiv.org/abs/2309.06130) | :heavy_minus_sign: |
| [CCMR: High Resolution Optical Flow Estimation via Coarse-to-Fine Context-Guided Motion Reasoning](https://openaccess.thecvf.com/content/WACV2024/html/Jahedi_CCMR_High_Resolution_Optical_Flow_Estimation_via_Coarse-To-Fine_Context-Guided_Motion_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/cv-stuttgart/CCMR?style=flat)](https://github.com/cv-stuttgart/CCMR) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Jahedi_CCMR_High_Resolution_Optical_Flow_Estimation_via_Coarse-To-Fine_Context-Guided_Motion_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.02661-b31b1b.svg)](http://arxiv.org/abs/2311.02661) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=8grm1D97Lqk) |
| [Weakly-Supervised Representation Learning for Video Alignment and Analysis](https://openaccess.thecvf.com/content/WACV2024/html/Bar-Shalom_Weakly-Supervised_Representation_Learning_for_Video_Alignment_and_Analysis_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Bar-Shalom_Weakly-Supervised_Representation_Learning_for_Video_Alignment_and_Analysis_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2302.04064-b31b1b.svg)](http://arxiv.org/abs/2302.04064) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=y4dboMr_t2I) |
| [MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network](https://openaccess.thecvf.com/content/WACV2024/html/Mehraban_MotionAGFormer_Enhancing_3D_Human_Pose_Estimation_With_a_Transformer-GCNFormer_Network_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/TaatiTeam/MotionAGFormer?style=flat)](https://github.com/TaatiTeam/MotionAGFormer) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Mehraban_MotionAGFormer_Enhancing_3D_Human_Pose_Estimation_With_a_Transformer-GCNFormer_Network_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.16288-b31b1b.svg)](http://arxiv.org/abs/2310.16288) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=r6LzqV1cWag) |
| [Leveraging Synthetic Data to Learn Video Stabilization Under Adverse Conditions](https://openaccess.thecvf.com/content/WACV2024/html/Kerim_Leveraging_Synthetic_Data_To_Learn_Video_Stabilization_Under_Adverse_Conditions_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/A-Kerim/SyntheticData4VideoStabilization_WACV_2024?style=flat)](https://github.com/A-Kerim/SyntheticData4VideoStabilization_WACV_2024) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Kerim_Leveraging_Synthetic_Data_To_Learn_Video_Stabilization_Under_Adverse_Conditions_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2208.12763-b31b1b.svg)](http://arxiv.org/abs/2208.12763) | :heavy_minus_sign: |
| [What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection](https://openaccess.thecvf.com/content/WACV2024/html/Gothe_Whats_in_the_Flow_Exploiting_Temporal_Motion_Cues_for_Unsupervised_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Gothe_Whats_in_the_Flow_Exploiting_Temporal_Motion_Cues_for_Unsupervised_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Learning the what and how of Annotation in Video Object Segmentation](https://openaccess.thecvf.com/content/WACV2024/html/Delatolas_Learning_the_What_and_How_of_Annotation_in_Video_Object_WACV_2024_paper.html) | [![WEB Page](https://img.shields.io/badge/WEB-Page-159957.svg)](https://eva-vos.compute.dtu.dk/) <br /> [![GitHub](https://img.shields.io/github/stars/thanosDelatolas/eva-vos?style=flat)](https://github.com/thanosDelatolas/eva-vos) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Delatolas_Learning_the_What_and_How_of_Annotation_in_Video_Object_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.04414-b31b1b.svg)](http://arxiv.org/abs/2311.04414) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=SBLETVHONtc) |
| [Lightweight Delivery Detection on Doorbell Cameras](https://openaccess.thecvf.com/content/WACV2024/html/Khorramshahi_Lightweight_Delivery_Detection_on_Doorbell_Cameras_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Khorramshahi_Lightweight_Delivery_Detection_on_Doorbell_Cameras_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2305.07812-b31b1b.svg)](http://arxiv.org/abs/2305.07812) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=hjc7DnHXTz8) |
| [Spatio-Temporal Filter Analysis Improves 3D-CNN for Action Classification](https://openaccess.thecvf.com/content/WACV2024/html/Kobayashi_Spatio-Temporal_Filter_Analysis_Improves_3D-CNN_for_Action_Classification_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Kobayashi_Spatio-Temporal_Filter_Analysis_Improves_3D-CNN_for_Action_Classification_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [PMI Sampler: Patch Similarity Guided Frame Selection for Aerial Action Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Xian_PMI_Sampler_Patch_Similarity_Guided_Frame_Selection_for_Aerial_Action_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/Ricky-Xian/PMI-Sampler?style=flat)](https://github.com/Ricky-Xian/PMI-Sampler) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Xian_PMI_Sampler_Patch_Similarity_Guided_Frame_Selection_for_Aerial_Action_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2304.06866-b31b1b.svg)](http://arxiv.org/abs/2304.06866) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=BBzvG8i6BuY) |
| [Optimizing Long-Term Robot Tracking with Multi-Platform Sensor Fusion](https://openaccess.thecvf.com/content/WACV2024/html/Albanese_Optimizing_Long-Term_Robot_Tracking_With_Multi-Platform_Sensor_Fusion_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Albanese_Optimizing_Long-Term_Robot_Tracking_With_Multi-Platform_Sensor_Fusion_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=bK8w-YCNTpc) |
| [Learnable Cube-based Video Encryption for Privacy-Preserving Action Recognition](https://openaccess.thecvf.com/content/WACV2024/html/Ishikawa_Learnable_Cube-Based_Video_Encryption_for_Privacy-Preserving_Action_Recognition_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Ishikawa_Learnable_Cube-Based_Video_Encryption_for_Privacy-Preserving_Action_Recognition_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=mPMnW5lnnZo) |
| [A*: Atrous Spatial Temporal Action Recognition for Real Time Applications](https://openaccess.thecvf.com/content/WACV2024/html/Kim_A_Atrous_Spatial_Temporal_Action_Recognition_for_Real_Time_Applications_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_A_Atrous_Spatial_Temporal_Action_Recognition_for_Real_Time_Applications_WACV_2024_paper.pdf) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=LY4ZlXt_pXM) |
| [SEMA: Semantic Attention for Capturing Long-Range Dependencies in Egocentric Lifelogs](https://openaccess.thecvf.com/content/WACV2024/html/Nagar_SEMA_Semantic_Attention_for_Capturing_Long-Range_Dependencies_in_Egocentric_Lifelogs_WACV_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/Pravin74/Semantic_attention?style=flat)](https://github.com/Pravin74/Semantic_attention) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Nagar_SEMA_Semantic_Attention_for_Capturing_Long-Range_Dependencies_in_Egocentric_Lifelogs_WACV_2024_paper.pdf) | :heavy_minus_sign: |
| [Triplet Attention Transformer for Spatiotemporal Predictive Learning](https://openaccess.thecvf.com/content/WACV2024/html/Nie_Triplet_Attention_Transformer_for_Spatiotemporal_Predictive_Learning_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Nie_Triplet_Attention_Transformer_for_Spatiotemporal_Predictive_Learning_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2310.18698-b31b1b.svg)](http://arxiv.org/abs/2310.18698) | [![YouTube](https://img.shields.io/badge/YouTube-%23FF0000.svg?style=for-the-badge&logo=YouTube&logoColor=white)](https://www.youtube.com/watch?v=eMTDabWqnjk) |
| [ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection](https://openaccess.thecvf.com/content/WACV2024/html/Phan_ZEETAD_Adapting_Pretrained_Vision-Language_Model_for_Zero-Shot_End-to-End_Temporal_Action_WACV_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024/papers/Phan_ZEETAD_Adapting_Pretrained_Vision-Language_Model_for_Zero-Shot_End-to-End_Temporal_Action_WACV_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.00729-b31b1b.svg)](http://arxiv.org/abs/2311.00729) | :heavy_minus_sign: |
